
Cocojunk
🚀 Dive deep with CocoJunk – your destination for detailed, well-researched articles across science, technology, culture, and more. Explore knowledge that matters, explained in plain English.
Forth (programming language)
Read the original article here.
Forth: A Deep Dive into a Stack-Oriented Language for Building from Scratch
Forth is a unique programming language and development environment that holds significant relevance for anyone interested in the fundamental principles of computing, interacting directly with hardware, and understanding how software systems can be built from the ground up, particularly in resource-constrained environments. Designed by Charles H. "Chuck" Moore and first used by others in 1970, Forth offers a perspective vastly different from most mainstream languages.
Originally, the name was intended to be "FOURTH" (signifying a "fourth-generation" language for "third-generation" hardware, aiming to leapfrog existing tools), but was shortened to "FORTH" due to a five-character filename limit on the IBM 1130 system where it was first developed. While not an acronym, the name is sometimes still seen in all caps.
Forth's design prioritizes interactivity, small size, and extensibility, making it an ideal subject for exploring concepts vital to understanding how to build systems without relying on large, complex operating systems or extensive standard libraries.
Core Concepts of Forth
At its heart, Forth is built around a few fundamental ideas:
Words: All operations, data, and control flow structures in Forth are defined as "words." A Forth program is essentially a sequence of words.
Definition: In Forth, a word is the fundamental unit of code or data. It's a named routine, variable, constant, or control structure that can be executed by the Forth interpreter or compiled into a program. Think of words as analogous to functions, procedures, or commands in other languages.
The Stack: Forth is stack-oriented. Parameters for words are passed implicitly on a data stack, and results are returned on the same stack. This mechanism is central to how Forth programs are structured and how words communicate.
Definition: A stack is a Last-In, First-Out (LIFO) data structure. In Forth, the data stack is used to hold numbers and other values that words operate on. Values are pushed onto the stack and popped off the top by words.
Reverse Polish Notation (RPN): Because parameters are taken from the stack before the operation is specified, Forth code is written in Reverse Polish Notation (also known as postfix notation).
Definition: Reverse Polish Notation (RPN), or postfix notation, is a mathematical notation in which every operator follows all of its operands. For example,
2 3 +
means "push 2, push 3, then add the top two stack items (2 and 3), leaving the result (5) on the stack." This contrasts with infix notation (2 + 3
).RPN makes parsing the language much simpler, as the interpreter doesn't need complex rules for operator precedence or parentheses. It also aligns naturally with stack operations.
Example: Arithmetic in RPN
Let's calculate
(25 * 10) + 50
using RPN and trace the stack:25
: Pushes 25 onto the stack. Stack:[ 25 ]
10
: Pushes 10 onto the stack. Stack:[ 25, 10 ]
*
: The multiplication word. It pops the top two values (10 and 25), multiplies them (250), and pushes the result back. Stack:[ 250 ]
50
: Pushes 50 onto the stack. Stack:[ 250, 50 ]
+
: The addition word. It pops the top two values (50 and 250), adds them (300), and pushes the result back. Stack:[ 300 ]
.
: The dot word. It pops the top value (300) and prints it. Stack:[ ]
The printed output would be
300
.Interactive Development: Forth typically includes an integrated command shell. Users can define new words, test them immediately, and redefine them without a lengthy compile-link cycle. This interactive, incremental approach is highly productive for development, especially when working directly with hardware.
Why Forth for "Building From Scratch"?
For someone exploring low-level systems and building computing environments from their foundations, Forth offers several compelling advantages and insights:
- Minimal Footprint: Early Forth systems were incredibly small, fitting a complete development environment (compiler, editor, user programs) into the limited memory of 8-bit systems. This demonstrates how powerful functionality can be achieved with minimal resources.
- Direct Hardware Access: Forth is designed to be close to the hardware. Its simplicity and extensibility make it relatively easy to write words that interact directly with memory, I/O ports, and other hardware features, which is essential when building bare-metal systems.
- Simplicity of Core: The core of a Forth interpreter and compiler is surprisingly simple to implement. This makes it feasible to port Forth to new architectures or even build a basic Forth system yourself, embodying the "building from scratch" philosophy.
- Extensibility: The entire language, including the compiler and control structures, is built from Forth words. This allows programmers to extend the language itself, creating domain-specific words tailored to a particular application or hardware. It acts as a "meta-application language."
- Historical Relevance: Forth has a history of being the first software running on new chips, used in boot loaders, embedded systems, and applications requiring direct control and efficiency in limited environments. Studying Forth provides context for how software was developed before the dominance of large operating systems and complex toolchains.
Structure of the Forth Language
The structure of Forth reflects its design goals of simplicity and extensibility.
The Dictionary
The core data structure in Forth is the dictionary. This is where all defined words are stored and looked up by the interpreter.
Definition: The dictionary in Forth is the primary data structure that holds all defined words. It acts as a symbol table, mapping the name of a word (a string) to its executable code or data structure in memory.
The dictionary is typically structured as a tree or, more commonly, a linked list of linked lists. When the interpreter encounters a word, it searches the dictionary, usually starting with the most recently defined words. If the word is found, the associated code is executed.
Each entry in the dictionary, representing a defined word, generally consists of two parts: the head and the body.
- Head: Contains information needed by the compiler and interpreter during dictionary lookups, such as the word's name (Name Field, NF) and a pointer to the previously defined word (Link Field, LF) to form the linked list structure.
- Body: Contains the executable code or data associated with the word. It includes the Code Field (CF), which points to the code that runs when the word is executed, and the Parameter Field (PF), which may contain data or pointers to other words in the case of high-level definitions.
The separation of head and body is significant, especially in cross-compilation (building code for one system on another). The head might reside on the compiling host, while the body is sent to the target system, reducing the memory footprint on the target if interactive development isn't needed there.
Code Structure: Threaded Code
Historically, a key implementation technique for Forth was threaded code.
Definition: Threaded code is a code representation where the "instructions" are not CPU instructions but pointers (or addresses) to executable code sequences (often other Forth words). Executing threaded code involves a small inner interpreter that follows these pointers.
Instead of compiling directly to machine code, a Forth word defined in Forth (a "colon definition") would be compiled into a sequence of addresses pointing to the words it calls. Executing the compiled word involved the inner interpreter reading these addresses one by one and executing the corresponding words.
There are variations:
- Indirect Threaded Code (ITC): The compiled word contains addresses of Code Fields (CFs) of the words it calls. The inner interpreter fetches the CF address, then jumps to the code pointed to by the CF. This was common in early FIG-Forth.
- Direct Threaded Code (DTC): The compiled word contains addresses directly pointing to the executable code of the words it calls. The inner interpreter simply jumps to the address. This is slightly faster as it saves one indirection step.
- Subroutine Threaded Code (STC): The compiled word consists of native CPU subroutine call instructions targeting the code of the words being called. This is faster but relies more on native call/return mechanisms.
Threaded code offered excellent code density (small size) and portability (the inner interpreter was the only piece that needed to be rewritten for a new CPU). Modern Forths often compile directly to optimized native machine code for maximum performance, but the concept of threaded code remains fundamental to understanding traditional Forth implementation and its small size.
Working with Forth: The Interpreter and Compiler
Forth systems operate in two main states: interpretation and compilation.
Definition: Interpretation state: The Forth system reads input words and executes them immediately. Compilation state: The Forth system reads input words and appends their execution semantics to the definition of a new word currently being compiled.
When you type into a Forth console, the interpreter is running. Its basic algorithm is simple:
- Read input from the user (typically a line).
- Parse the input into space-delimited words.
- For each parsed word:
- Look up the word in the dictionary.
- If found:
- If in interpretation state, execute the word's interpretation semantics.
- If in compilation state, execute the word's compilation semantics (typically, append its execution code/pointer to the current definition).
- If not found, try to convert the word to a number.
- If successful, push the number onto the data stack.
- If conversion fails, report an error ("word not recognised").
Defining New Words
The word :
(colon) is a parsing word that starts the definition of a new word. It reads the next word from the input stream as the name of the word being defined, creates a dictionary entry for it, and switches the system into compilation state.
The word ;
(semi-colon) is another parsing word that ends the current definition, appends the necessary 'return' code, and switches the system back to interpretation state.
Example: Defining a simple word
: DOUBLE ( n -- 2n ) DUP + ;
:
: Start defining a new word.DOUBLE
: The name of the new word.( n -- 2n )
: This is a comment explaining the stack effect. It means the word expects one numbern
on the stack and leaves one number2n
on the stack. Comments in parentheses are ignored by the compiler/interpreter.DUP
: Duplicate the top stack item. Ifn
is on the stack, it becomes[ n, n ]
.+
: Add the top two stack items. Popsn
andn
, pushes2n
. Stack:[ 2n ]
.;
: End the definition and return to interpretation state.
Now, when you type 5 DOUBLE .
the interpreter:
- Pushes
5
: Stack[ 5 ]
- Finds
DOUBLE
, which is a compiled word. It executes the code compiled by:
.- Inside
DOUBLE
,DUP
runs: Stack[ 5, 5 ]
- Inside
DOUBLE
,+
runs: Stack[ 10 ]
- Inside
- Finds
.
, executes it: Pops10
and prints10
. Stack[ ]
.
Mixing Compilation and Interpretation
While usually in compilation state between :
and ;
, you can temporarily switch back to interpretation state using [
(left bracket) and return to compilation with ]
(right bracket). This is useful for calculating a value during compilation and embedding the result directly into the compiled code using LITERAL
.
Example: Embedding a calculated value
Suppose you want to embed the ASCII value of 'Q' in a word.
: EMIT-Q [ CHAR Q ] LITERAL EMIT ;
:
: Start definingEMIT-Q
. Enter compilation state.[
: Switch to interpretation state.CHAR Q
:CHAR
is a parsing word that reads the next word (Q
), takes its first character's ASCII value (81), and pushes it onto the data stack. Stack (during interpretation):[ 81 ]
]
: Switch back to compilation state.LITERAL
: This is a word executed during compilation. It takes the value currently on the data stack (81) and compiles code intoEMIT-Q
's definition that will, whenEMIT-Q
is executed, push that value (81) onto the data stack again.EMIT
: This word compiles code that will, whenEMIT-Q
is executed, take the value from the data stack (81) and output the corresponding character ('Q').;
: End definition.
When EMIT-Q
is later executed, the LITERAL
part pushes 81 onto the stack, and EMIT
uses it to print 'Q'.
A common shortcut is the [CHAR]
word, which is an immediate word (explained below) version of CHAR
followed by LITERAL
.
: EMIT-Q [CHAR] Q EMIT ; \ Uses [CHAR] shortcut
Here, [CHAR] Q
executes immediately during compilation, pushes 81 onto the compilation-time data stack, and then compiles the literal 81 into the definition, followed by the compilation of EMIT
. The \
indicates a comment to the end of the line.
Immediate Words
Some words need to execute during compilation rather than being compiled for later execution. These are immediate words.
Definition: An immediate word is a Forth word that, when encountered during compilation, executes its interpretation semantics immediately instead of compiling its compilation semantics into the current definition.
Examples include ;
(which ends compilation) and [
. Control flow words like IF
, ELSE
, THEN
, BEGIN
, WHILE
, REPEAT
, DO
, LOOP
are also typically immediate. They execute during compilation to compile the correct branching and looping primitives (like BRANCH
and ?BRANCH
) based on the control flow structure. The data stack is used at compilation time to keep track of nested control structures and their branch addresses, which are 'back-patched' as the compiler encounters matching words (e.g., THEN
patches the address for IF
).
The IMMEDIATE
word is used to mark the most recently defined word as immediate. The POSTPONE
word allows you to force an immediate word to be compiled rather than executed immediately.
Data Objects in Forth
While Forth heavily relies on the data stack for temporary storage and parameter passing, it also supports named data objects. These are created using special defining words.
Definition: A defining word is a Forth word that, when executed, creates a new dictionary entry for another word (an "instance") and specifies the runtime behavior of that new word.
Examples of standard defining words include:
VARIABLE
: Creates a new word that, when executed, pushes the memory address of a single-cell (usually 4 or 8 bytes) storage location onto the stack. You use@
(fetch) to get the value from the address and!
(store) to write a value to the address.VARIABLE MY-VAR ( Creates MY-VAR ) 10 MY-VAR ! ( Stores 10 in MY-VAR ) MY-VAR @ . ( Fetches value from MY-VAR and prints it: 10 )
CONSTANT
: Creates a new word that, when executed, pushes a specific, unchanging value onto the stack.100 CONSTANT MAX-SIZE ( Creates MAX-SIZE ) MAX-SIZE . ( Prints the value: 100 )
CREATE
: Creates a new word that, when executed, pushes the memory address of a named location onto the stack. This location can then be used to store an array, a string, or other data structures using words likeALLOT
to reserve space.CREATE BUFFER 100 ALLOT ( Creates BUFFER and reserves 100 bytes ) BUFFER . ( Prints the starting address of BUFFER )
A key aspect of Forth's extensibility is the ability for programmers to define custom defining words. This allows creating specialized data types or structures with specific instance behaviors, such as a word that represents a hardware register and automatically handles the correct I/O operations when accessed.
Named data objects defined this way are typically global in scope. Forth encourages using the data stack for parameters and local variables, although some modern Forth implementations also provide explicit local variable syntax.
Forth is weakly typed; it does not enforce data types. It is the programmer's responsibility to use words that operate correctly on the data values present on the stack or at memory addresses. This provides flexibility but requires careful programming.
System-Level Considerations
Forth's design is heavily influenced by the need to run efficiently in minimal environments.
Operating System, Files, and Multitasking
Traditional Forth systems often ran "bare metal," without a host operating system. Instead of files, source code and data were stored in fixed-size disk blocks, accessed by their physical disk address using words like BLOCK
. This approach was necessary on early systems lacking file systems but has largely been replaced by using host operating system files in modern Forths.
Definition: In classic Forth systems, disk blocks were fixed-size units of storage (often 1KB) on disk used to store source code or data. The Forth system managed buffers in memory to hold blocks, automatically reading and writing them as needed, effectively acting as its own minimal file system layer.
Multitasking, typically a simple cooperative round-robin scheduler, is often built directly into the Forth system. A word like PAUSE
saves the current task's stacks and context, finds the next task, and restores its context. This low-overhead task switching makes Forth suitable for embedded systems needing concurrent processes, even on small microcontrollers.
Modern Forth systems running on hosts like Linux or Windows usually interface with the host OS's file system and may use the OS's multitasking features, integrating into the larger system environment.
Self-Compilation and Cross-Compilation
A powerful technique in Forth, particularly relevant to building systems from scratch, is self-compilation (or meta-compilation, though Forth programmers use the term slightly differently). A full Forth system can compile itself.
The basic idea is to redefine the few core words responsible for placing compiled bytes into memory (fetch
and store
, often internal primitives). These redefined words are directed to write the compiled output into a separate memory buffer or even send it out over a serial port. Then, the existing compiler and the rest of the Forth system are recompiled using these new fetch
and store
words. The result is a new version of the Forth system compiled and stored in the designated buffer or sent to the target.
Definition: Self-compilation (Meta-compilation) in Forth is the process where a running Forth system compiles a new version of itself. This is often achieved by temporarily redefining fundamental words that handle memory access to direct the output of the compiler to a different location.
Definition: Cross-compilation is the process of compiling code on one computer system (the host) to run on a different computer system (the target), often with a different architecture. Forth's self-compilation capabilities make it well-suited for building cross-compilers for embedded systems, where the target system might have very limited resources or no development tools itself.
For cross-compilation to an embedded target, the redefined fetch
and store
words might send bytes over a serial connection to the target hardware. The target only needs a minimal resident program capable of receiving bytes and executing a specified address – the absolute minimum "kernel" to bootstrap the Forth system. This highlights how little is truly required on the target to start running complex code, a key lesson in "building from scratch."
Real-World Applications (Revisited)
Forth's unique characteristics have led to its use in various domains, often where its size, speed, interactivity, and direct hardware control were critical:
- Space and Astronomy: Used by NASA and other organizations due to its reliability, compactness, and suitability for controlling instruments. The Philae spacecraft used Forth.
- Boot Loaders and Firmware: Open Firmware, a standard used by companies like Apple, IBM, and Sun, is based on Forth, providing an interactive environment for system initialization and diagnostics before the main OS loads.
- Embedded Systems: Ideal for microcontrollers and systems with limited memory where a full OS is not feasible or necessary. Rockwell produced microcontrollers with built-in Forth kernels.
- Hardware Bring-up: Because of its interactive nature and low-level access, Forth was often the first high-level tool used when porting software to new processor architectures (e.g., the Intel 8086, Macintosh 128K).
- Industrial Control: Used in systems requiring real-time control and interaction with hardware.
- Early Software Development: Popular in the early microcomputer era (Atari, Jupiter ACE) due to memory constraints. Used for various applications, including video games (Electronic Arts titles like Starflight) and business software (RapidFile, VP-Planner).
- Special Effects: Famously used to control the Dykstraflex motion control camera system for Star Wars: Episode IV – A New Hope, allowing precise, repeatable camera movements essential for combining live-action and model shots. Its real-time capabilities and interactive nature were key here.
Notable Implementations
While the core concepts remain similar, numerous Forth implementations exist, adapting the language to different platforms and purposes. Some modern, ANS Forth compliant examples include Gforth (a portable, open-source option), SwiftForth, and VFX Forth (both focusing on native code generation), and implementations for microcontrollers like noForth. Open Firmware is a significant application built on Forth.
Further Exploration
Studying Forth can also lead to exploring hardware designed specifically for executing Forth code. Processors like the RTX2010 execute Forth operations directly in hardware, bypassing the need for an inner interpreter or traditional CPU instruction set, representing an ultimate convergence of language and architecture.
This resource aims to provide a structured understanding of Forth, emphasizing the aspects that make it valuable for someone interested in the low-level details of computing and the art of building systems from scratch. Its simplicity, stack-based nature, interactivity, and historical context offer a powerful lens through which to view fundamental software and hardware interactions.